Using Sparse Classification Outputs as Feature Observations for Noise-robust ASR

Authors

Yang Sun

Bert Cranen

Jort F. Gemmeke

Louis ten Bosch

Lou Boves

Mathew Magimai-Doss

Abstract

Sparse Classification (SC) is an exemplar-based approach to Automatic Speech Recognition (ASR). By representing noisy speech as a sparse linear combination of speech and noise exemplars, SC makes it possible to separate speech from noise. The approach has proven robust in noisy conditions, but at the cost of degraded performance in clean conditions. In this work, rather than using the state probability estimates obtained with SC directly in Viterbi decoding, we model the probability distributions of SC with Gaussian Mixture Models (GMMs), for which purpose we introduce a novel whitening transformation. Results on the AURORA2 task show that the proposed approach is especially effective for clean speech and in the matched noise conditions of test set A. Except in the -5 dB SNR condition, we also find substantial improvements in the non-matched noise conditions of test set B.
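The decomposition step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the multiplicative update rule (a generalized Kullback-Leibler objective with an L1 sparsity penalty), the function name `sparse_activations`, the dictionary sizes, and the random "exemplars" are all assumptions chosen for illustration. The GMM modeling and the whitening transformation are not shown, because the abstract gives no details about them.

```python
import numpy as np

def sparse_activations(y, A, sparsity=0.1, n_iter=200, eps=1e-12):
    """Estimate non-negative activations x with y ~= A @ x.

    Uses multiplicative updates for a generalized KL divergence plus an
    L1 penalty on x. This is one common choice, assumed here for illustration.
    """
    x = np.ones(A.shape[1])
    col_sums = A.sum(axis=0)
    for _ in range(n_iter):
        approx = A @ x + eps                      # current reconstruction
        x *= (A.T @ (y / approx)) / (col_sums + sparsity)
    return x

# Hypothetical toy dictionary: random non-negative "exemplars".
rng = np.random.default_rng(0)
n_dims, n_speech, n_noise = 20, 30, 10
A_speech = rng.random((n_dims, n_speech))
A_noise = rng.random((n_dims, n_noise))
A = np.hstack([A_speech, A_noise])

# Synthetic noisy observation built from two speech exemplars and one noise exemplar.
x_true = np.zeros(n_speech + n_noise)
x_true[[3, 17]] = [1.0, 0.5]
x_true[n_speech + 2] = 0.8
y = A @ x_true

x = sparse_activations(y, A)

# Splitting the activations by exemplar type separates speech from noise.
speech_estimate = A_speech @ x[:n_speech]
noise_estimate = A_noise @ x[n_speech:]
rel_error = np.linalg.norm(y - A @ x) / np.linalg.norm(y)
print(f"relative reconstruction error: {rel_error:.3f}")
```

In a recognizer of this kind, the speech activations would typically be mapped to state probability estimates through the state labels attached to each speech exemplar; that mapping is omitted here.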


Similar resources

A Novel Noise-Robust Texture Classification Method Using Joint Multiscale LBP

In this paper we describe a novel noise-robust texture classification method using joint multiscale local binary patterns. The first step in texture classification is to describe the texture by extracting different features. Several methods have been developed for this task; one of the most popular is the Local Binary Pattern (LBP) method and its variants, such as Completed Local Binary...


Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory

This article proposes and evaluates various methods to integrate the concept of bidirectional Long Short-Term Memory (BLSTM) temporal context modeling into a system for automatic speech recognition (ASR) in noisy and reverberated environments. Building on recent advances in Long Short-Term Memory architectures for ASR, we design a novel front-end for context-sensitive Tandem feature extraction a...


Exploring Low-Dimensional Structures of Modulation Spectra for Robust Speech Recognition

The development of noise-robustness techniques is vital to the success of automatic speech recognition (ASR) systems in the face of varying sources of environmental interference. Recent studies have shown that exploring low-dimensional structures of speech features can yield good robustness. In this vein, research on low-rank representation (LRR), which considers the intrinsic structures of speech...


Face Recognition using an Affine Sparse Coding approach

Sparse coding is an unsupervised method that learns a set of over-complete bases to represent data such as images and video. It has attracted increasing interest for image classification applications in recent years. However, when similar images come from different classes, as in face recognition applications, different images may be classified into the same class, and hen...


Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations

Dissertation: Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations, by Lee Ngee Tan. Doctor of Philosophy in Electrical Engineering, University of California, Los Angeles, 2014. Professor Abeer Alwan, Chair. This dissertation focuses on algorithms for robust speech and bird song processing. Many applications perform well under ideal signal conditions,...



Publication date: 2012
